ABSTRACT

Two optimization models for the construction of
tests with a maximal value of coefficient alpha are given. Both 
models have a linear form and can be solved by using a 
branch-and-bound algorithm. The first model assumes an item bank 
calibrated under the Rasch model and can be used, for instance, when 
classical test theory has to serve as an interface between the item 
bank system and a user not familiar with modern test theory. 
Maximization of alpha was obtained by inserting a special constraint 
in a linear programming model. The second model has wider 
applicability and can be used with any item bank for which estimates 
of the classical item parameters are available. The models can be
expanded to meet practical constraints with respect to test 
composition. An empirical study with simulated data using two item 
banks of 500 items was carried out to evaluate the model assumptions. 
For Item Bank 1 the underlying response model was the Rasch model, and for
Item Bank 2 the underlying model was the three-parameter model. An 
appendix discusses the relation between item response theory and 
classical parameter values and adds the case of a multidimensional 
item bank. Three tables present the simulation study data. (SLD) 
Abstract

Two optimization models for the construction of tests with a 
maximal value of coefficient alpha are given. Both models 
have a linear form and can be solved using a branch-and-bound 
algorithm. The first model assumes an item bank calibrated 
under the Rasch model and can be used, for instance, when 
classical test theory has to serve as an interface between 
the item bank system and a user not familiar with modern test 
theory. The second model has wider applicability and can be 
used with any item bank for which estimates of the classical 
item parameters are available. The models can be expanded to 
meet practical constraints with respect to test composition. 
An empirical study with simulated data was carried out to 
evaluate the model assumptions. 
Algorithmic Test Design
Using Classical Item Parameters

A useful phenomenon in educational and psychological measurement
is the construction of customized tests from item banks.
An item bank is a large collection of test items all
measuring the same ability or domain of knowledge, stored in
a computer together with empirical estimates of their
properties. The fact that the item properties are known
allows the test constructor to have explicit control of them 
and select optimal tests. 

If the properties of the items are modeled using an item 
response theory (IRT) model, estimates of parameters
representing such properties as item difficulty, discriminating
power, and the effect of random guessing are stored in the 
item bank. Although item selection can be based on these 
parameter values, a more advanced procedure uses the item and 
test information function from IRT. Birnbaum (1968) and Lord 
(1980) suggested a procedure in which the test constructor 
first specifies a target for the test information function 
and then selects the items such that the sum of their 
information functions meets the target. Theunissen (1985) 
presented a zero-one programming model for selecting a test
of minimal length subject to the condition that its
information function is not below the target. The model can
be solved using one of the branch-and-bound algorithms
available in the literature (e.g., Wagner, 1975). Alternative
models and procedures have been given by Adema (1988),
Boekkooi-Timminga (1987), Boekkooi-Timminga and van der
Linden (1987), Theunissen (1986), Theunissen and Verstralen
(1986), van der Linden (1987), and van der Linden and
Boekkooi-Timminga (1988a, 1988b).

Although item banking and IRT are natural partners (van
der Linden, 1986a), this does not necessarily imply that test
construction has to be based solely on information functions.
The following two examples refer to practical cases in which
item selection based on parameters from classical test theory
may be helpful:

1. The item bank has been calibrated under an IRT model,
but some of the users are not familiar with the theory
and want to have the option of using classical item
parameters. In such cases, it is possible to use
classical test theory as an interface between the item
bank system and its users. The system then predicts the
classical parameter values for the population of
examinees concerned (see the Appendix), enabling the
users to select tests with optimal values for the
parameters.

2. The examinees are sampled from a population with a fixed
ability distribution. Therefore, for certain applications
the use of classical item parameters may be
feasible. For example, the classical index $\pi$ roughly
orders the difficulties of the items for a random
examinee, and the availability of estimates of the item
$\pi$ values is considered sufficient to base test
construction on. Item banking with classical item
parameters is dealt with extensively in de Gruijter
(1986).

The present paper was motivated by an item banking
project, in which the need for the option in the former example
was felt. The first test construction model presented below
deals with this case. The second model is more general and
is also applicable in other cases where item selection is
based on classical parameters. Both models are zero-one
programming models that maximize (a linearized version of)
the well-known coefficient alpha. Results from an empirical
study with simulated data to verify the model assumptions
follow the presentation of the models.

Maximal Test Reliability as a Classical Goal 

A classical goal in test construction is maximization of the 
reliability of the test for a given population of examinees. 
Since the reliability coefficient can only be estimated from
hard-to-realize replicated measurements, in practice
coefficient alpha, a well-known and simple lower bound to the
test reliability, is mostly used.

Let $\sigma_i^2$ denote the variance of the scores on item i for
the given population, let $\sigma_X^2$ denote the variance of the
test score X, and let $\rho_{iX}$ represent the item-test
correlation. For a test of n items, coefficient alpha is
defined as:

(1)  $\alpha = n(n-1)^{-1}\left[1 - \left(\sum_{i=1}^{n}\sigma_i^2\right)\sigma_X^{-2}\right]$

(Lord & Novick, 1968, sect. 4.4). Since

(2)  $\sigma_X^2 = \left(\sum_{i=1}^{n}\sigma_i\rho_{iX}\right)^2$

(Lord & Novick, 1968, sect. 15.3), the right-hand term in the
bracketed expression is equal to

(3)  $\left(\sum_{i=1}^{n}\sigma_i^2\right)\left(\sum_{i=1}^{n}\sigma_i\rho_{iX}\right)^{-2}$.

For a test of fixed length, maximization of alpha is
equivalent to minimization of (3).
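The relation between (1), (2), and (3) can be checked numerically. The following is a minimal sketch in Python, not part of the original study; the function names and the small score matrix in the usage note are illustrative assumptions. It computes alpha once from item standard deviations and item-test correlations via (2) and (3), and once directly from the item and test score variances.

```python
from math import sqrt


def pvariance(xs):
    """Population variance (divisor N), as used in the classical formulas."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def item_parameters(scores):
    """Item SDs and item-test correlations from an examinee-by-item matrix."""
    totals = [sum(row) for row in scores]
    mean_x, sd_x = sum(totals) / len(totals), sqrt(pvariance(totals))
    sigmas, rhos = [], []
    for i in range(len(scores[0])):
        col = [row[i] for row in scores]
        mean_i, sd_i = sum(col) / len(col), sqrt(pvariance(col))
        cov = sum((c - mean_i) * (x - mean_x)
                  for c, x in zip(col, totals)) / len(col)
        sigmas.append(sd_i)
        rhos.append(cov / (sd_i * sd_x))
    return sigmas, rhos


def alpha_from_item_parameters(sigmas, rhos):
    """Coefficient alpha via (1), with sigma_X taken from (2)."""
    n = len(sigmas)
    sum_var = sum(s * s for s in sigmas)                 # numerator of (3)
    sigma_x = sum(s * r for s, r in zip(sigmas, rhos))   # sigma_X by (2)
    return n / (n - 1) * (1 - sum_var / sigma_x ** 2)


def alpha_from_scores(scores):
    """Coefficient alpha via (1), using the observed test score variance."""
    n = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n)]
    test_var = pvariance([sum(row) for row in scores])
    return n / (n - 1) * (1 - sum(item_vars) / test_var)
```

For a toy matrix such as `[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [0, 1, 1]]` both routes give the same value, because the item-test covariances sum to the test score variance.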

A zero-one programming model for maximization of alpha
can now be formulated as follows. For each item i = 1, ..., I
the decision variable $x_i$ is defined:

(4)  $x_i = \begin{cases} 0 & \text{if item } i \text{ is not in the test} \\ 1 & \text{if item } i \text{ is in the test.} \end{cases}$

A maximal value of alpha is obtained for a solution to
the following problem:

(5)  minimize $\left(\sum_{i=1}^{I}\sigma_i^2 x_i\right)\left(\sum_{i=1}^{I}\sigma_i\rho_{iX}x_i\right)^{-2}$

subject to

(6)  $\sum_{i=1}^{I} x_i = n$,

(7)  $x_i \in \{0, 1\}$,  i = 1, ..., I.

Although the model is of the zero-one type, it has a
nonlinear objective function. Efficient algorithms for
solving such models are not known (Garfinkel & Nemhauser,
1972). In addition, a minor problem in (5) is the dependency
of $\rho_{iX}$ on the unknown test score. As is usual in classical
test construction, this problem will be ignored. Also, for an
item bank system with an underlying IRT model, it is easy to
predict the correlation between the item score and the
number-right score for the complete bank. This constant could
be substituted for $\rho_{iX}$ in (5).
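For a very small bank the nonlinear model can still be solved by complete enumeration, which makes the combinatorial nature of the problem concrete. The sketch below is an illustration only; the function name is hypothetical, the item-test correlations are treated as fixed constants as suggested above, and their weighted sum is assumed to be positive.

```python
from itertools import combinations


def best_test_by_enumeration(sigmas, rhos, n):
    """Minimize (sum sigma_i^2)(sum sigma_i rho_iX)^-2 over all tests of n items.

    Returns the indices of the selected items and the attained value.
    The number of candidate tests is C(I, n), so this is only feasible
    for small banks.
    """
    best_items, best_value = None, None
    for items in combinations(range(len(sigmas)), n):
        numerator = sum(sigmas[i] ** 2 for i in items)
        denominator = sum(sigmas[i] * rhos[i] for i in items) ** 2
        value = numerator / denominator
        if best_value is None or value < best_value:
            best_items, best_value = items, value
    return best_items, best_value
```

With 500 items and n = 20 the number of candidate tests is astronomically large, which is why linear reformulations are sought.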

Two alternative linear models will be given, for which
practical algorithms do exist. In the first model, a
condition for alpha to be maximal is inserted as a linear
constraint into the model. The condition can be shown to
apply for an item bank calibrated under the Rasch (1980)
model. The second model does not assume any IRT model. In
this model, a linearized part of (3) is used as objective
function, whereas the remaining part serves as a linear
constraint. The two models will now be derived.

Maximal Alpha as a Linear Constraint 

For the two-parameter normal-ogive model, a simple relation
between the discrimination parameter and the item-ability
correlation exists. Also, it is known that the logistic
function approximates the normal ogive excellently. On these
two findings the following derivation of a sufficient
condition for alpha to be maximal is based.

Derivation of the Condition 

For an ability $\theta$, the normal-ogive model is defined as:

(8)  $p_i(\theta) = \int_{-\infty}^{a_i(\theta - b_i)} \varphi(t)\,dt$,

where $\varphi$ is the standard normal density, $a_i$ and $b_i$ are
the parameters for the discriminating power and difficulty of
item i, and $p_i(\theta)$ is the probability of a correct response
on i for an examinee with ability $\theta$.

If latent response variables $Y_i$, i = 1, ..., I, are
assumed such that $Y_i \ge \gamma_i$ generates a correct response, but
$Y_i < \gamma_i$ an incorrect one, and the distributions of $Y_i$ given
$\theta$ are normal with linear regression functions and
homoscedasticity, the following relation exists between $a_i$
and the item-ability (biserial) correlation $\rho_{i\theta}$:

(9)  $\rho_{i\theta} = a_i(1 + a_i^2)^{-1/2}$

(Lord & Novick, 1968, sects. 16.8 - 16.10). Since, for a
scale factor 1/1.7 in (8), the logistic and normal-ogive
curves are known to approximate each other by less than 0.01
uniformly in $\theta$ (Haley, in Lord & Novick, 1968, sect. 17.2;
for improvements on this well-known result, see Molenaar,
1974) and the Rasch model is a logistic model with $a_i = 1$ for
all items, it follows that in the Rasch model $\rho_{i\theta}$ is
approximately constant. Hence, if for all items $\rho_{iX}$ has the
same relation to $\rho_{i\theta}$, it is also a constant. In this case (2)
reduces to

(10)  $\sigma_X^2 = c\left(\sum_{i=1}^{n}\sigma_i\right)^2$

with c > 0. It follows that

(11)  $\sigma_X^2 = c\left(\sum_{i=1}^{n}\sigma_i^2 + \sum_{i \neq j}\sigma_i\sigma_j\right)$.

Substituting this result into (3) yields

(12)  $c^{-1}\left\{1 + \left(\sum_{i \neq j}\sigma_i\sigma_j\right)\left(\sum_{i=1}^{n}\sigma_i^2\right)^{-1}\right\}^{-1}$.

Observe that (12) now is invariant under multiplication of
$(\sigma_1, \ldots, \sigma_n)$ by a constant. Without loss of generality,
$\sum_{i=1}^{n}\sigma_i^2$ can therefore be taken to be equal to a constant
k > 0. This shows that minimization of (12) amounts to
maximization of $\sum_{i \neq j}\sigma_i\sigma_j$. However, (11) implies that this
is also equivalent to maximization of $\sigma_X^2$. Hence, it follows
from (10) that (3) has a minimum for the value of
$(\sigma_1, \ldots, \sigma_n)$ maximizing

(13)  $\sum_{i=1}^{n}\sigma_i$

under the condition that

(14)  $\sum_{i=1}^{n}\sigma_i^2 = k$.

Maximization using Lagrange multipliers results in the
following system of equations:

(15)  $1 + 2\lambda\sigma_i = 0$,  i = 1, ..., n;

(16)  $\sum_{i=1}^{n}\sigma_i^2 - k = 0$.

Since k is arbitrary, the system is solved for

(17)  $0 < \sigma_1 = \ldots = \sigma_n$.
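The step from the Lagrange system to the equal-variance solution can be written out explicitly. The following short derivation is added for illustration and uses only (15) and (16):

```latex
\begin{align*}
1 + 2\lambda\sigma_i = 0
  \;&\Longrightarrow\; \sigma_i = -\frac{1}{2\lambda}, \qquad i = 1, \ldots, n, \\
\sum_{i=1}^{n}\sigma_i^2 = k
  \;&\Longrightarrow\; \frac{n}{4\lambda^2} = k
  \;\Longrightarrow\; \lambda = -\tfrac{1}{2}\sqrt{n/k}, \\
\sigma_1 = \cdots = \sigma_n &= \sqrt{k/n} > 0 .
\end{align*}
```

The negative root for $\lambda$ is taken because the standard deviations must be positive. Every $\sigma_i$ equals the same expression in $\lambda$, so the stationary point has all item standard deviations equal, whatever the value of k.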



Thus, provided the assumptions leading to (10) are satisfied,
coefficient alpha is maximal if and only if

(18)  $0 < \xi_1 = \ldots = \xi_n < 1$,

where $\xi_i = \pi_i$ or $1 - \pi_i$, and $\pi_i$ is the classical difficulty
parameter for item i. Both options appear because
$\sigma_i^2 = \pi_i(1 - \pi_i)$ takes the same value for $\pi_i$ and $1 - \pi_i$.
Without loss of generality, in the following only the case of
equal $\pi_i$ values will be considered.

Dependent on the composition of the item bank, the
solution need not be unique and simultaneous optimization
with respect to another goal can be possible. This result is
used in the following linear programming (LP) model.

LP Model 

It is assumed that the estimates of the item $\pi$ values are
rounded to a significant digit such that larger classes of
items with the same rounded value exist. The sets of indices
of items in the same class are denoted as $I_1, \ldots, I_j, \ldots, I_J$.
As an example of simultaneous optimization with respect to a
second goal, it is assumed that realistic estimates of the
time needed to solve the items in the bank exist and that the
goal is to minimize the total administration time needed for
the test. Let $t_i$ be such an estimate for item i, e.g., an
estimate of the 95th percentile in the distribution of time
needed to solve item i for the given population.

The following linear model realizes (18) while at the same
time minimizing the total administration time of the test:

(19)  minimize $\sum_{i=1}^{I} t_i x_i$

subject to

(20)  $\sum_{i \in I_j} x_i - n y_j = 0$,  j = 1, ..., J,

(21)  $\sum_{i=1}^{I} x_i = n$,

(22)  $x_i, y_j \in \{0, 1\}$,  i = 1, ..., I,  j = 1, ..., J.
The additional decision variable $y_j$ in the model indicates
from which class the items are selected. The constraint in
(21) guarantees that exactly n items are selected from the I
items in the bank. The constraints in (20) and (21) together
allow $y_j$ to take the value one exactly once. The model in
(19) through (22) is linear and can be solved by a standard 
branch-and-bound algorithm from the operations research 
literature. Adema (1988) gives a modified branch-and-bound 
procedure that reduces the CPU-time needed for a standard 
procedure considerably. 
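Because (20) and (21) force all n items to come from a single difficulty class, this particular model can also be solved without branch-and-bound by inspecting each class in turn. The sketch below shows that shortcut, which is a swapped-in technique and not the algorithm referred to in the text; the function name and argument layout are assumptions made for illustration.

```python
from collections import defaultdict


def select_equal_difficulty_test(pi_rounded, times, n):
    """Solve (19)-(22) by enumeration over difficulty classes.

    pi_rounded[i] is the rounded difficulty of item i and times[i] its
    estimated solution time. Returns (class value, item indices, total
    time), or None if no class contains at least n items.
    """
    classes = defaultdict(list)
    for i, p in enumerate(pi_rounded):
        classes[p].append(i)              # index sets I_1, ..., I_J

    best = None
    for p, members in classes.items():
        if len(members) < n:
            continue                      # y_j = 1 is infeasible for this class
        # Within a class, the n fastest items minimize objective (19).
        chosen = sorted(members, key=lambda i: times[i])[:n]
        total = sum(times[i] for i in chosen)
        if best is None or total < best[2]:
            best = (p, sorted(chosen), total)
    return best
```

As soon as further constraints are added this shortcut no longer applies in general, and a general zero-one solver is needed.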

The above model is too simple to deal with most test 
construction problems. In practice, usually various kinds of 
restrictions with respect to, e.g., item content,
simultaneous inclusion of different items, or ranges of
possible item-parameter values exist. This point will be
taken up after the presentation of an alternative model.

A Linearized Version of Alpha as Objective Function 

In the previous model, a maximal value of alpha was realized
by adopting a special constraint in the model. The following
model explicitly maximizes alpha by a direct attack on (3).
Inspection of (5) shows that both of its sums are linear
in the decision variables. This suggests an approach in which
one of these expressions is used as objective function and
the other as a constraint. Since for a wide range of possible
values of $\pi$, the numerator of (3) varies less than the
denominator, alpha can be expected to depend more strongly on
the latter. This effect is verified empirically in Ebel (1967).
Therefore it seems sensible to maximize the denominator of
(3) while constraining the numerator to a low value. This is
realized in the following model:



(23)  maximize $\sum_{i=1}^{I}\sigma_i\rho_{iX}x_i$

subject to

(24)  $\sum_{i=1}^{I}\sigma_i^2 x_i \le v$,

(25)  $\sum_{i=1}^{I} x_i = n$,

(26)  $x_i \in \{0, 1\}$,  i = 1, ..., I,

where v > 0 is a constant. Again, the model is linear and can
be solved for $(x_1, \ldots, x_I)$ by one of the branch-and-bound
algorithms referred to earlier.
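For a toy bank the effect of the bound v can be explored by solving (23) through (26) with complete enumeration instead of branch-and-bound. The following sketch is illustrative only; the function name and the numerical tolerance are assumptions.

```python
from itertools import combinations


def solve_linearized_model(sigmas, rhos, n, v, tol=1e-12):
    """Maximize sum(sigma_i * rho_iX * x_i) subject to (24)-(26).

    Enumerates all tests of n items, so it is only meant for small banks.
    Returns (item indices, objective value), or (None, None) when no test
    satisfies constraint (24).
    """
    best_items, best_obj = None, None
    for items in combinations(range(len(sigmas)), n):
        if sum(sigmas[i] ** 2 for i in items) > v + tol:   # constraint (24)
            continue
        obj = sum(sigmas[i] * rhos[i] for i in items)      # objective (23)
        if best_obj is None or obj > best_obj:
            best_items, best_obj = items, obj
    return best_items, best_obj
```

Lowering v excludes tests with large item variances, so the selected test can change even though the objective function stays the same.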

The choice of a value for v can be motivated as follows.
The maximal possible value of $\sum_{i=1}^{I}\sigma_i^2 x_i$ in the model is equal
to n/4. In addition, the numerator and denominator of (3)
have $\sigma_i$ as a common factor. Therefore, if v approaches its
maximum, a maximal value for the denominator will be found,
but at the same time the numerator will tend to be too large.
On the other hand, if v approaches its minimum, a minimal
value for the numerator will be attained but at the cost of a
constrained denominator. Now this is due not only to the
common factor $\sigma_i$, but also to a restriction-of-range effect
on $\rho_{iX}$. Hence, the optimal value of v will tend to be closer
to n/4 than to zero. This issue will be pursued further in the
section on empirical results below. It should be noted that by
varying v all possible tests of length n can be produced as a
solution to the model. So in principle the structure of the
model does not preclude any possible test from showing up as
optimal.
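The bound n/4 follows directly from the variance of a dichotomously scored item. As a short worked step, assuming 0/1 item scores so that $\sigma_i^2 = \pi_i(1 - \pi_i)$:

```latex
\begin{align*}
\sigma_i^2 = \pi_i(1 - \pi_i) &\le \tfrac{1}{4},
  \qquad \text{with equality if and only if } \pi_i = \tfrac{1}{2}, \\
\sum_{i=1}^{I}\sigma_i^2 x_i &\le \tfrac{1}{4}\sum_{i=1}^{I} x_i = \frac{n}{4}.
\end{align*}
```

For a test of n = 20 items the bound equals 5, so any $v \ge 5$ leaves the constraint on the numerator inactive.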

Possible Additional Constraints 

As already noted, in order to solve most practical test
construction problems, the above models have to be made more
realistic. For example, a test constructor may want to have
control of such features of the test as its validity with
respect to several domains of content represented in the item
bank. This and all other possible demands can be adopted in
the above models, provided they are formulated as linear
constraints. The models can then still be solved by the same
class of algorithms and the solution always automatically
meets the new constraints. An extensive review of possible
constraints from the practice of test construction that can
be formulated in a linear form is given in van der Linden and
Boekkooi-Timminga (1988b). The review includes constraints
controlling the composition of the test with respect to
behavioral dimensions and item content and format; item
parameters like administration time, frequency of previous
administrations, and item difficulty; curriculum differences
between groups; inclusion or exclusion of individual items;
and dependencies between the items. The following example
illustrates some of the possibilities.



Example 

A test with maximal value of coefficient alpha has to be
constructed from a Physics item bank. From each of the topics
p = 1, ..., P covered by sets of items, $V_p$, in the bank, the
test constructor wants $n_p$ items in the test. The items have
also been classified with respect to a behavioral dimension
(e.g., knowledge of facts, concepts, application of rules)
and from each of the sets $V_q$, q = 1, ..., Q, at least $n_q$
items should be in the test. The estimated time in minutes
needed to solve the items in the bank, $t_i$, i = 1, ..., I,
(see above) is known and the total administration time is not
allowed to exceed T minutes. Also, for each item it has been
recorded how often it was administered before, and only items
with a frequency of previous administration, $f_i$, not larger
than one are allowed in the test. Items with a multiple-choice
format, collected in subset $V_s$, should be excluded
from the test. Finally, for some special reason the test
constructor wants item #115 in the test, and items #19 and
#203 may not be chosen together. All these demands are
realized in the following model:



(27)  maximize $\sum_{i=1}^{I}\sigma_i\rho_{iX}x_i$

subject to

(28)  $\sum_{i=1}^{I}\sigma_i^2 x_i \le v$,

(29)  $\sum_{i \in V_p} x_i = n_p$,  p = 1, ..., P,

(30)  $\sum_{i \in V_q} x_i \ge n_q$,  q = 1, ..., Q,

(31)  $\sum_{i=1}^{I} t_i x_i \le T$,

(32)  $f_i x_i \le 1$,  i = 1, ..., I,

(33)  $\sum_{i \in V_s} x_i = 0$,

(34)  $x_{115} = 1$,

(35)  $x_{19} + x_{203} \le 1$,

(36)  $x_i \in \{0, 1\}$,  i = 1, ..., I.
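How the constraints of this example interact can be illustrated with a small enumeration sketch. It is not the branch-and-bound procedure used in practice; the dictionary keys, the function name, and the assumption that every item belongs to exactly one topic (so that the test length equals the sum of the topic quotas) are illustrative choices.

```python
from collections import Counter
from itertools import combinations


def solve_example_model(bank, v, n_topic, n_behavior, T,
                        banned_format, forced_id, pair):
    """Enumerate tests satisfying (28)-(36) and maximize (27).

    bank is a list of dicts with keys id, sigma, rho, topic, behavior,
    time, freq, format. Returns the sorted ids of the best test or None.
    """
    # (32)-(33): fix x_i = 0 for over-used items and the excluded format.
    pool = [it for it in bank
            if it["freq"] <= 1 and it["format"] != banned_format]
    # (34): fix x_i = 1 for the forced item; infeasible if screened out.
    forced = [it for it in pool if it["id"] == forced_id]
    if not forced:
        return None
    free = [it for it in pool if it["id"] != forced_id]
    n = sum(n_topic.values())          # test length implied by (29)

    best, best_obj = None, None
    for rest in combinations(free, n - 1):
        test = forced + list(rest)
        ids = {it["id"] for it in test}
        if sum(it["sigma"] ** 2 for it in test) > v:              # (28)
            continue
        topics = Counter(it["topic"] for it in test)
        if any(topics[p] != k for p, k in n_topic.items()):       # (29)
            continue
        behaviors = Counter(it["behavior"] for it in test)
        if any(behaviors[q] < k for q, k in n_behavior.items()):  # (30)
            continue
        if sum(it["time"] for it in test) > T:                    # (31)
            continue
        if pair[0] in ids and pair[1] in ids:                     # (35)
            continue
        obj = sum(it["sigma"] * it["rho"] for it in test)         # (27)
        if best_obj is None or obj > best_obj:
            best, best_obj = sorted(ids), obj
    return best
```

The first three steps only shrink the set of free decision variables; the remaining constraints are checked for every candidate test.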



It should be noted that when specifying the constants in
the model, certain relations ought to be obeyed. For
instance, no feasible solution exists if $\sum_{p=1}^{P} n_p$ [...].
Further, constraints (32) - (34) do not enter the actual
optimization procedure; they only reduce the number of
decision variables.

Empirical Validation of Model Assumptions

Two item banks of 500 items were generated to evaluate the
model assumptions. For Item Bank 1, the underlying response
model was the Rasch model with item parameters drawn from the
distribution N(-0.5, 1). For Item Bank 2, the underlying
model was the 3-parameter model with item parameters $a_i$ and
$b_i$ drawn from the distributions U(0.5, 1.5) and U(-3, 1),
respectively. The guessing parameter $c_i$ was set equal to 0.1.
To estimate the item difficulties, $p_i$, and item
discriminations (i.e., item-test correlations, where the
whole item bank is considered as the test), $r_{iX}$, 1,000
examinees ($\theta \sim N(0, 1)$) were generated to answer the items.

The program Lando (Anthonisse, 1984) was used to solve
the zero-one programming models on a DEC 2060 computer.
Because it takes too much time to find a zero-one solution
for the model in (23) to (26), the relaxation of this model
was solved, i.e., the model with the constraints $0 \le x_i \le 1$,
i = 1, 2, ..., I, instead of $x_i \in \{0, 1\}$. This could be done
because it is known (Dantzig, 1957) that the number of
fractional values in the solution is not greater than the
number of constraints. Therefore, the solution to the model
in (23) to (26) was found by rounding at most two fractional
values.
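The data generation for a Rasch bank of this kind can be sketched as follows. This is an illustration under stated assumptions and not the software used in the study: the seed, the function names, and the use of Python's `random` module are choices made here, and the item-test correlation is computed against the total score on the whole bank.

```python
import math
import random


def simulate_rasch_bank(n_items=500, n_examinees=1000, seed=1):
    """Draw difficulties b ~ N(-0.5, 1), abilities theta ~ N(0, 1), and 0/1 responses."""
    rng = random.Random(seed)
    b = [rng.gauss(-0.5, 1.0) for _ in range(n_items)]
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]
    scores = [[1 if rng.random() < 1.0 / (1.0 + math.exp(-(t - bi))) else 0
               for bi in b] for t in theta]
    return b, scores


def classical_estimates(scores):
    """Proportion correct and item-bank correlation for every item."""
    m, n = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]
    mean_x = sum(totals) / m
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in totals) / m)
    p, r = [], []
    for i in range(n):
        col = [row[i] for row in scores]
        p_i = sum(col) / m
        sd_i = math.sqrt(p_i * (1.0 - p_i))
        cov = sum(c * x for c, x in zip(col, totals)) / m - p_i * mean_x
        p.append(p_i)
        # Items answered identically by everyone carry no correlation.
        r.append(cov / (sd_i * sd_x) if sd_i > 0 and sd_x > 0 else 0.0)
    return p, r
```

Calling `simulate_rasch_bank()` with the defaults mirrors the size of Item Bank 1; smaller values are convenient for quick checks.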

The model assumptions were first verified by comparing
tests from Item Bank 1 for different values of v and p. The
number of items in the tests was 20. Table 1 shows the values
of coefficient $\alpha$. In this table, $\alpha^*$ denotes coefficient alpha
with item-bank correlations replacing item-test correlations,
whereas $\alpha$ is the exact value of the coefficient calculated
after the test was selected.



Insert Table 1 here 



Table 1 shows that the differences between values of $\alpha$
of tests constructed for different values of v were small.
Higher values of v gave the best results. The values of $\alpha$ for
the model with maximal alpha as a constraint were not as good
as for the other model. From Table 1 it is also clear that
the results were worse, the greater the difference between
the chosen value of p and .5. The same trend was observed in
extensive simulations not reported here (see Adema, 1987).
Apparently, the assumptions leading to the well-known result
in (9), or the assumption of $\rho_{iX}$ having the same relation to
$\rho_{i\theta}$ for all items, are not entirely met for arbitrary data
sets.

For Item Bank 2 (3-parameter model), only the model with
a linearized version of $\alpha$ as objective function was
applicable. Again, tests were constructed for different
values of v. The results are displayed in Table 2.



Insert Table 2 here 



Once more the best results were found for high values of v. 
Therefore, it is possible to choose v maximal so that 
constraint (24) is redundant and can be omitted. 

Because the variances of the items are not as important
as the item discriminations, the following zero-one
programming model was also tried out:



(37)  maximize $\sum_{i=1}^{I}\rho_{iX}x_i$

subject to

(38)  $\sum_{i=1}^{I} x_i = n$,

(39)  $x_i \in \{0, 1\}$,  i = 1, 2, ..., I.

In Table 3, values of $\alpha$ are shown for tests constructed
with model (23), (25), and (26) (v maximal) and with model
(37) through (39). The number of items in the tests was 20 or
40 and the models were applied to both item banks.



Insert Table 3 here 



Table 3 demonstrates that model (37) to (39) gave very good
results. The values of $\alpha$ were as good as for the best choices
of v in Tables 1 and 2.

Tables 1, 2, and 3 show that it is possible to construct
tests with item-test correlations replaced by item-bank
correlations, because generally tests with a high value for
$\alpha^*$ also have a high value for $\alpha$.

Discussion 

Two models for maximization of coefficient alpha as a 
function of classical item parameters were presented. 

In the first model maximization of alpha was obtained by
inserting a special constraint in a linear programming model.
The fact that minimization of administration time was chosen
as explicit objective function was just for the purpose of
illustration. Measures for other aspects, e.g., for
curricular fit of the test or uniform usage of the items in
the bank (van der Linden & Boekkooi-Timminga, in press),
could also have been optimized. The important point to note
is that the model allows optimization with respect to two
different objectives. The model is based on a formalization
of the intuitive notion that an item bank conforming to the
Rasch model should consist of items with equal (classical)
discriminating power. However, the formalization, which
resulted in (18) as a condition for alpha to be maximal, also
needed extra assumptions in addition to the Rasch model. As
shown in Table 1, for items satisfying the condition in (18)
$\alpha$ tends to decrease if $\pi_i$ deviates from .50. For
$.30 \le \pi_i \le .70$ the results are still satisfactory but outside
this interval $\alpha$ drops relatively quickly. Since the data were
generated under the Rasch model, this phenomenon implies that
the extra assumptions are not tenable for all possible data
sets. Therefore, models as in (19) to (22) are only
recommended for items with values for $\pi_i$ in this interval.

The second model is universal in the sense that it does
not assume any IRT model or other assumptions about the
items. The model is a direct attack on the kernel of alpha in
(3): it maximizes the denominator while at the same time
constraining the numerator. Ample experience with the model
for various types of data has shown that the solution
invariably produces the maximal value for alpha for v
close to n/4 (the maximum of $\sum_{i=1}^{I}\sigma_i^2 x_i$ in the model). For
example, for n = 50, all simulations produced the maximum of
alpha for v in the neighborhood of 95% of n/4. Also, the optimal
value of alpha increases monotonically with v to the point at
which the maximum is obtained and then shows a monotonic but
slight decrease. Therefore, for large item banks, n > 500,
say, it is recommended to set v at its maximal value.
However, as already observed, the model in (37) through (39)
almost always produced comparable results. If no additional
constraints have to be met, this model can be solved by a
simple algorithm that picks items with the largest values for
their item-test correlations. For such applications, the
model is strongly recommended.
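The simple algorithm mentioned for model (37) through (39) amounts to sorting the items on their item-test correlations and keeping the n largest. A minimal sketch, with a hypothetical function name:

```python
def select_top_n(rhos, n):
    """Return the indices of the n items with the largest item-test correlations.

    This solves (37)-(39) exactly when no further constraints are present;
    ties are broken in favor of the item that appears first in the bank.
    """
    order = sorted(range(len(rhos)), key=lambda i: rhos[i], reverse=True)
    return sorted(order[:n])
```

For example, `select_top_n([0.3, 0.7, 0.5, 0.6], 2)` selects the items with indices 1 and 3.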

Finally, it is observed that the advantage of a linear 
programming approach to test construction lies not only in 
its power to optimize a test parameter such as coefficient alpha,
but also in the possibility to include additional practical 
constraints. The example given earlier shows that the 
presence of such constraints easily involves combinatorial 
problems that cannot be solved by hand. 

APPENDIX 

The availability of an item bank system with items calibrated
under an IRT model allows the possibility to use classical
test theory as an interface between the system and a user not
familiar with IRT. For a given population of examinees the
system is able to predict the values of the classical
parameters for the items. These values can be used as the
input of one of the models in the paper, whereafter the
system predicts the values for the test parameters of the
resulting test.

Jensema (1976) and Lord (1980) deal with the relation
between IRT and classical parameter values. The following
summarizes some of the results and adds the case of a
multidimensional item bank. A complete treatment is given in
van der Linden (1986b).

Let $p_i(\theta)$ be the probability of a correct response on
item i for an examinee with ability $\theta$ explained by the IRT
model and let $F(\theta)$ be the distribution function for the
population of examinees under consideration. The basic
equations are:



(1)  $\pi_i = \int_{-\infty}^{\infty} p_i(\theta)\,dF(\theta)$

(2)  $\pi_{ij} = \int_{-\infty}^{\infty} p_i(\theta)p_j(\theta)\,dF(\theta)$

The first equation gives the classical item difficulty; the
second equation uses the property of local independence and
is necessary to derive the item-test correlation:

(3)  $\rho_{iX} = \sigma_{iX}(\sigma_i\sigma_X)^{-1}$.

This follows from

(4)  $\sigma_i^2 = \pi_i(1 - \pi_i)$,

(5)  $\sigma_{iX} = \sum_j(\pi_{ij} - \pi_i\pi_j)$,

(6)  $\sigma_X^2 = \sum_i\pi_i(1 - \pi_i) + \sum_{i \neq j}(\pi_{ij} - \pi_i\pi_j)$.
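The prediction of classical parameters from IRT parameters can be sketched numerically. The example below assumes the Rasch model and a standard normal ability distribution, and replaces the integrals in (1) and (2) by sums over an equally spaced grid; the grid limits, the number of points, and the function names are illustrative assumptions. In (5) the term with j = i is taken as $\pi_{ii} = \pi_i$, which holds for 0/1 item scores.

```python
import math


def normal_grid(lo=-6.0, hi=6.0, steps=241):
    """Equally spaced ability points with normalized N(0, 1) weights."""
    h = (hi - lo) / (steps - 1)
    pts = [lo + k * h for k in range(steps)]
    w = [math.exp(-0.5 * t * t) for t in pts]
    total = sum(w)
    return pts, [x / total for x in w]


def rasch(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def classical_from_irt(bs, grid=None):
    """Predict item difficulties pi_i and item-test correlations rho_iX."""
    pts, w = grid or normal_grid()
    n = len(bs)
    probs = [[rasch(t, b) for t in pts] for b in bs]
    pi = [sum(wk * pk for wk, pk in zip(w, probs[i])) for i in range(n)]  # (1)

    def pi_joint(i, j):
        if i == j:
            return pi[i]                     # U_i squared equals U_i
        return sum(wk * a * c
                   for wk, a, c in zip(w, probs[i], probs[j]))            # (2)

    cov = [[pi_joint(i, j) - pi[i] * pi[j] for j in range(n)]
           for i in range(n)]
    var_x = sum(sum(row) for row in cov)                                  # (6)
    rho = [sum(cov[i]) / math.sqrt(cov[i][i] * var_x)                     # (3)-(5)
           for i in range(n)]
    return pi, rho
```

For an item with difficulty 0 the predicted $\pi_i$ is .5 by symmetry, which gives a quick sanity check on the quadrature.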

Case of Multidimensionality


Suppose the item bank falls apart into two different sets of
items and that for each set an IRT model holds. Let $\theta_u$ and $\theta_v$
be the ability parameters spanning the items in each set,
while $F_u(\theta_u)$, $F_v(\theta_v)$, and $F_{uv}(\theta_u, \theta_v)$ are now the distribution
functions for the given population of examinees. If $i_u$ and $i_v$
denote an arbitrary item in the two respective sets, then the
basic equations are

(7)  $\pi_{i_u} = \int p_{i_u}(\theta_u)\,dF_u(\theta_u)$,

(8)  $\pi_{i_u i_v} = \iint p_{i_u}(\theta_u)\,p_{i_v}(\theta_v)\,dF_{uv}(\theta_u, \theta_v)$.

Equation (8) assumes the property of local independence for
response variables associated with items from different sets.
The property can be proven to hold as follows:

Proof. Let $\{U_{i_u}\}$ and $\{U_{i_v}\}$ be the response variables
associated with the two sets of items. Since for $\{U_{i_u}\}$ an IRT
model holds, $\theta_u$ spans this set completely. Thus, no partition
of the population of examinees is possible that introduces
different distributions over $\{U_{i_u}\}$ for a given value of $\theta_u$.
Therefore, the values of $\{U_{i_v}\}$ cannot introduce such a
partition and the variables in $\{U_{i_u}\}$ are locally independent
of those in $\{U_{i_v}\}$. []
Table 1

Results for tests constructed from Item Bank 1
(Rasch model; n = 20)

Maximal alpha as Objective      Maximal alpha as a Constraint

  v     alpha*    alpha           p     alpha*    alpha

 5.0    .8096     .8478          .30    .6922     .7866
 4.5    .8028     .8413          .40    .7245     .7956
 4.0    .7803     .8252          .50    .7331     .8004
 3.5    .7491     .8069          .60    .7373     .7997
                                 .70    .7136     .7896
                                 .80    .6441     .7559
                                 .90    .4131     .6701


Table 2

Results for tests constructed from Item Bank 2
(Three-parameter model; n = 20)

  v     alpha*    alpha

 5.0    .8201     .8579
 4.5    .8252     .8607
 4.0    .8199     .8551
 3.5    .8045     .8460
 3.0    .7874     .8386

Table 3

Results for tests constructed with Model (23), (25), and (26)
and Model (37) - (39) from Item Banks 1 and 2

                 Model (23), (25), (26)     Model (37) - (39)
Item
Bank      n       alpha*     alpha           alpha*     alpha

 1       20       .8096      .8478           .8107      .8465
 1       40       .9013      .9122           .9020      .9122
 2       20       .8201      .8579           .8256      .8603
 2       40       .9074      .9188           .9096      .9196